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3 <110> APPLICANT: DSM IP ASSETS B.V.
5 <120> TITLE OF INVENTION: SQS gene
7 <130> FILE REFERENCE: NDR5218
9 <140> CURRENT APPLICATION NUMBER: US/10/528,872
9 <141> CURRENT FILING DATE: 2005-03-23
9 <150> PRIOR APPLICATION NUMBER: PCT/EP03/10573
10 <151> PRIOR FILING DATE: 2003-09-23
12 <150> PRIOR APPLICATION NUMBER: EP 02021619.8
13 <151> PRIOR FILING DATE: 2002-09-27
15 <160> NUMBER OF SEQ ID NOS: 8
17 <170> SOFTWARE: PatentIn version 3.2


19 <210> SEQ ID NO: 1
20 <211> LENGTH: 4807
21 <212> TYPE: DNA
22 <213> ORGANISM: Phaffia rhodozyma

25 <220> FEATURE:
26 <221> NAME/KEY: 5'UTR
27 <222> LOCATION: (1469)..(1470)

29 <220> FEATURE:
30 <221> NAME/KEY: exon
31 <222> LOCATION: (1550)..(1577)

33 <220> FEATURE:
34 <221> NAME/KEY: Intron
35 <222> LOCATION: (1578)..(1752)

37 <220> FEATURE:
38 <221> NAME/KEY: exon
39 <222> LOCATION: (1753)..(1766)

41 <220> FEATURE:
42 <221> NAME/KEY: Intron
43 <222> LOCATION: (1767)..(1882)

45 <220> FEATURE:
46 <221> NAME/KEY: exon
47 <222> LOCATION: (1883)..(2071)

49 <220> FEATURE:
50 <221> NAME/KEY: Intron
51 <222> LOCATION: (2072)..(2182)

53 <220> FEATURE:
54 <221> NAME/KEY: exon
55 <222> LOCATION: (2183)..(2397)

57 <220> FEATURE:
58 <221> NAME/KEY: Intron
59 <222> LOCATION: (2398)..(2474)

61 <220> FEATURE:
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62 <221> NAME/KEY: exon
63 <222> LOCATION: (2475)..(3087)

65 <220> FEATURE:
66 <221> NAME/KEY: Intron
67 <222> LOCATION: (3088)..(3230)

69 <220> FEATURE:
70 <221> NAME/KEY: exon
71 <222> LOCATION: (3231)..(3356)

73 <220> FEATURE:
74 <221> NAME/KEY: Intron
75 <222> LOCATION: (3357)..(3453)

77 <220> FEATURE:
78 <221> NAME/KEY: exon
79 <222> LOCATION: (3454)..(3475)

81 <220> FEATURE:
82 <221> NAME/KEY: Intron
83 <222> LOCATION: (3476)..(3564)

85 <220> FEATURE:
86 <221> NAME/KEY: exon
87 <222> LOCATION: (3565)..(3881)

89 <220> FEATURE:
90 <221> NAME/KEY: Intron
91 <222> LOCATION: (3882)..(3958)

93 <220> FEATURE:
94 <221> NAME/KEY: exon
95 <222> LOCATION: (3959)..(3970)

97 <220> FEATURE:
98 <221> NAME/KEY: polyA_site
99 <222> LOCATION: (4106)..(4107)

101 <400> SEQUENCE: 1

102 gttcctgttc agtcaaagag tgggaaaaac atgaaagtaa aaagatgtaa tgaaagaagg 60
104 ggtcagaaca tcggagatac aatggcccat agaggaagga aagctactta ccagaaacca 120
106 gtgaggtttg cctaggaagt aatcccttcg tttctcaaag atatcttttt tgaaagcatc 180
108 gatgaacgac atgtcgaacc catctccatc ctcgaaatca agtttactcg atttagacct 240
110 ttccagcttt tctgctctct ccagtttcgc agctttctct tcgggaagaa gctctccgcc 300
112 agtcgatgtt ctgtcgacag gagaccagta gaaggcggaa ccgacaattt tggatggatc 360
114 ggaggacagg gtggctttaa caaatcggta gtacggagga tcgaacggcg cttctctcgt 420
116 tcgaaggttg actcctcttg ctatgtgtat gagagcatat ccgttgatgt ctcagttaaa 480
118 atttcctttt ctttctaccc ggagagtaag acacacaaag aatcacgaag aatatgatga 540
120 ctgaccgatc cgaatatcta gcgcaggttg cttctctact ggttccattc ttcgaacgat 600
122 aggttcatgt ttgaaagcat tgatcctagt tgcctctatc tgaggccagt ctgccaatgt 660
124 agcaggctca atgatcactt ggggtttgtg catcttgatg ttcaaccaag tgtcgcaacg 720
126 gtcgagattc ttttttcttc ttttggtcga gaaaaaaaaa cggcttcgct tcgcacgcgc 780
128 gcgggggatc acccgcatat taagcggtat gacgctcatc aaccggccaa gtgttcttca 840
130 tcataggtga aggttaaaac ggaatggata ggaggagcta accacgtttt tattttaatt 900
132 cgacttgggc agcctcgtcc atagtgtctg atggttatat cgtcatagaa aggcagcgcc 960
134 tggcgggttc gtcatggccg tgatcatctg ctttgttaga cattgtccat cagtcacctc 1020
136 aatgacagtt tcccgacgcc atcactaaga cacaaacgta tccagcacgc catgtccatc 1080
138 actgaagaag gtagggtctc gtcgagccag tgcaaccaga gttacagatg aacatcaggc 1140
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140 cttgatcaga cccgacttat gaatatggcc gttattgtac acttcttggt gctcctcgag 1200
142 ctgctctttc gtgtttttca ctttctttcc ggatcaaacg agactgctcg tgtatctatc 1260
144 tgtgcttgcc atatgagcat cccatgcctc tgctcaaatg atgctggagc tacgatccat 1320
146 cagagacgac acaaaacggg gttgtatgaa ctctacattt cctaatgtta ttggaatttt 1380
148 ctgtaatgcg ttcttcatct ttctctaatg cttttttgta gtccgtcttt tcaaccttgc 1440
150 cagcgtttcg cgtgtcttct ttctcctttg acggtcatca ctttcttctc tcttctcgtt 1500

152 ctttcttccg tccttccttc tctctcttcg tctgaacatc agcatcatc atg ggc ata 1558
153 Met Gly Ile
154 1

156 tca gat tac ctc gtt ctg g gtcagttctg tcttttgttt gattcttatc 1607
157 Ser Asp Tyr Leu Val Leu
158 5

160 ttcttgccgg cggtcgcctg tcttgggtat atcatcagca atgagaaaca tgatgttccc 1667
162 cccgcgtcaa tcactgacct tttggtcctc tacttctttc ctgtcgaatt gatcctgatt 1727

164 gatacgtgtg ccggctgctt aacag ct ttc acg cat cct gtaggtgttt 1776
165 Ala Phe Thr His Pro
166 10

168 tatcgtatgc ttcatgttga tgtttagtca cgcggactga cctggccggt tgattttctg 1836

170 tatgatcgct tgtgctaccg tctttcttgg aaatccttcc catcag gcc gat ctg 1891
171 Ala Asp Leu
172 15

174 cga gct tta atg cag tac gcg atc tgg cat gag cct cga agg aat atc 1939
175 Arg Ala Leu Met Gln Tyr Ala Ile Trp His Glu Pro Arg Arg Asn Ile
176 20 25 30

178 act gca cag gag gaa cat gca aca tcc ggt tgg gac cga gaa act atg 1987
179 Thr Ala Gln Glu Glu His Ala Thr Ser Gly Trp Asp Arg Glu Thr Met
180 35 40 45

182 aag gaa tgt tgg aag tat ttg gat ctg act tca aga agt ttc gca gct 2035
183 Lys Glu Cys Trp Lys Tyr Leu Asp Leu Thr Ser Arg Ser Phe Ala Ala
184 50 55 60 65

186 gtc atc aaa gag ttg gac gga gat ctt acc cga gtc gtacgtgttt 2081
187 Val Ile Lys Glu Leu Asp Gly Asp Leu Thr Arg Val
188 70 75

190 tcatcttctc tctcctttga gatctggtcg cctccgcatt ttcttgttgc agaagggtca 2141

192 gaagctgaca acaccatctc tactgttcgg gacacggcta g atc tgt tta ttc tat 2197
193 Ile Cys Leu Phe Tyr
194 80

196 ctc gct ctt cga gga ctg gat acc att gag gat gac atg agt cta tct 2245
197 Leu Ala Leu Arg Gly Leu Asp Thr Ile Glu Asp Asp Met Ser Leu Ser
198 85 90 95

200 aat gat gtg aag ctt ccc ctg ctt cgg aca ttc tgg gaa aag ctt gac 2293
201 Asn Asp Val Lys Leu Pro Leu Leu Arg Thr Phe Trp Glu Lys Leu Asp
202 100 105 110

204 tcc cct ggg tgg acc ttt act gga tcc ggt cca aat gag aag gat aga 2341
205 Ser Pro Gly Trp Thr Phe Thr Gly Ser Gly Pro Asn Glu Lys Asp Arg
206 115 120 125 130

208 gag ctt ctt gtt cac ttc gat gtg gcc atc gcc gag ttt gcc aac ttg 2389
209 Glu Leu Leu Val His Phe Asp Val Ala Ile Ala Glu Phe Ala Asn Leu
210 135 140 145
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212 gac gtc aa gtgagtttcc ctttatggtt ggatcatccg ctcgacagac 2437
213 Asp Val Asn

216 tcgaaacgct catcactttg gtctgcttga tgaacag c tct cgg aac gtc att 2490
217 Ser Arg Asn Val Ile
218 150

220 cga gac atc act cgc aag atg ggt aac ggt atg gcc gac ttt gct tct 2538
221 Arg Asp Ile Thr Arg Lys Met Gly Asn Gly Met Ala Asp Phe Ala Ser
222 155 160 165 170

224 ctc tct acg ccc tcc aag cct gtg gcc gag gtc cag tcg acc gaa gat 2586
225 Leu Ser Thr Pro Ser Lys Pro Val Ala Glu Val Gln Ser Thr Glu Asp
226 175 180 185

228 ttc aac cta tac tgt cat tac gtc gct gga ctc gtc ggc gag gga ctc 2634
229 Phe Asn Leu Tyr Cys His Tyr Val Ala Gly Leu Val Gly Glu Gly Leu
230 190 195 200

232 tcc cga ctc ttt gtc gcg acc gag aag gaa cga cca ttc ttg gcc aac 2682
233 Ser Arg Leu Phe Val Ala Thr Glu Lys Glu Arg Pro Phe Leu Ala Asn
234 205 210 215

236 cag atg gta ctt tca aac tcg ttc gga ctc ctt ctc caa aag aca aac 2730
237 Gln Met Val Leu Ser Asn Ser Phe Gly Leu Leu Leu Gln Lys Thr Asn
238 220 225 230

240 atc ctt cga gat att cgg gag gac gcc gac gaa ggt cgt ggc ttc tgg 2778
241 Ile Leu Arg Asp Ile Arg Glu Asp Ala Asp Glu Gly Arg Gly Phe Trp
242 235 240 245 250

244 cca aga gag atc tgg gcc aac ccg atc tat act gcg cat gca ccg ggc 2826
245 Pro Arg Glu Ile Trp Ala Asn Pro Ile Tyr Thr Ala His Ala Pro Gly
246 255 260 265

248 aca agg ttt aac tcg ttg act gac ctg gtc aag aaa gaa aac atc gac 2874
249 Thr Arg Phe Asn Ser Leu Thr Asp Leu Val Lys Lys Glu Asn Ile Asp
250 270 275 280

252 aaa gga tca atg tgg gtg ttg agt gcg atg aca ctc gac gcg atc acc 2922
253 Lys Gly Ser Met Trp Val Leu Ser Ala Met Thr Leu Asp Ala Ile Thr
254 285 290 295

256 cat act acc gac gca ctg gac tac ctc tca ctt cta aag aac cag agt 2970
257 His Thr Thr Asp Ala Leu Asp Tyr Leu Ser Leu Leu Lys Asn Gln Ser
258 300 305 310

260 gtt ttc aac ttt tgt gct atc ccg gct gtc atg tcg att gca acg ttg 3018
261 Val Phe Asn Phe Cys Ala Ile Pro Ala Val Met Ser Ile Ala Thr Leu
262 315 320 325 330

264 gag cta tgc ttc atg aac cca gcg gtg ttc caa cga aac ata aaa atc 3066
265 Glu Leu Cys Phe Met Asn Pro Ala Val Phe Gln Arg Asn Ile Lys Ile
266 335 340 345

268 aga aag gga gaa gcc gtc gag gtgcgttcgc gcgttctgtt tctacctttc 3117
269 Arg Lys Gly Glu Ala Val Glu
270 350

272 ataacattgg aggttcttga ctcttaagcg tcttccaatc tgatgcctcc aattatcatc 3177

274 atttttgtct tttttgcttt cctcttgttt cttttcggcg tgattcaatc cag ctc 3233
275 Leu

278 att atg aag tgc aac aac cct cgg gag gtg gca tac atg ttt aga gat 3281
279 Ile Met Lys Cys Asn Asn Pro Arg Glu Val Ala Tyr Met Phe Arg Asp
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280 355 360 365 370 

282 tat gct cga aag att cat gcc aag gct att cct aca gat cct aac ttc 3329
283 Tyr Ala Arg Lys Ile His Ala Lys Ala Ile Pro Thr Asp Pro Asn Phe
284 375 380 385

286 atc aag ttg agc gtt gcg tgt ggt cga gtgagttgat cgatcgatcc 3376
287 Ile Lys Leu Ser Val Ala Cys Gly Arg
288 390 395

290 atcttttgtt ttgatcatcg cgagacttga ctgatcgatt actcaaaaca tcatcgcttc 3436

292 tccttcttgc tctctag atc gaa caa tgg gct gag cac t gtatgttcct 3485
293 Ile Glu Gln Trp Ala Glu His
294 400

296 ccgcccctcc ttcaagtttc ctctcgcttc atctttgttg agaagaggga tctgatgtat 3545

298 ctttctttgt tcggatcag ac tac ccc tca ttt atg atg att cgg cct tcg 3596
299 Tyr Tyr Pro Ser Phe Met Met Ile Arg Pro Ser
300 405 410

302 aat gac cct caa aac ccc gca ccc tca acg gcg ctt gac cct ttc tca 3644
303 Asn Asp Pro Gln Asn Pro Ala Pro Ser Thr Ala Leu Asp Pro Phe Ser
304 415 420 425

306 gga gac gct cgt tta agg ata gcc tct aag aag gct gag atc acc gcc 3692
307 Gly Asp Ala Arg Leu Arg Ile Ala Ser Lys Lys Ala Glu Ile Thr Ala
308 430 435 440 445

310 gct gct ctt gtc agg aag aaa gcc cgg gat cac gct aag tgg aga gag 3740
311 Ala Ala Leu Val Arg Lys Lys Ala Arg Asp His Ala Lys Trp Arg Glu
312 450 455 460

314 tcc aag gga ttg cct ccg agc gat ccg aac aag ccg gac aac tcg gag 3788
315 Ser Lys Gly Leu Pro Pro Ser Asp Pro Asn Lys Pro Asp Asn Ser Glu
316 465 470 475

318 gat gtt aat tgg gta ttg atc ggc ggt atg atc gtt gga ttg ttg ctc 3836
319 Asp Val Asn Trp Val Leu Ile Gly Gly Met Ile Val Gly Leu Leu Leu
320 480 485 490

322 gtg atg ggc gtg ctc ggt ttg gct atc gct tgg gtt gtt ctt cag 3881
323 Val Met Gly Val Leu Gly Leu Ala Ile Ala Trp Val Val Leu Gln
324 495 500 505

326 gtgcgttctt ccaaagagcc tttctctcat gaacacgcac ataggttgat ctaattctat 3941

328 cttactctgt catacag ttt gag caa taa tctcaagatt ctagtccatc 3990
329 Phe Glu Gln
330 510

332 ctttcgctca acgatctgct tcttctcctt ctccttctcc gtcttctctg gtttcttttc 4050
334 ttactttctg ggatcttcct tcttgaatcc tccgatccaa tgtaatctgc ataccctcgc 4110
336 tttagtagaa accgatcctt cattcgatct tggcgaaaat ctaagcaaag agaatcactt 4170
338 ttgtctaata aaatttcctt taaagagtcg gctttttctt gtggcgaagc ttcatcccgt 4230
340 cttcctctgg accatctctt ctcaatattc tttgtgctac tatatgatca agttctttga 4290
342 aatcaaagaa gaacatgtat ttgattttga ggttccaaga atacaaccgg cccaagtcgt 4350
344 tcttcgcagt tttcatcaga cagcacatat ctctcctcct ctctatagaa gccgtatggg 4410
346 gccaatcgac tctcatgggt agaccgtgcc cttttgacac ggggagaaag agaacgaaag 4470
348 gacacttgac cgattcgtta ataaagccgt ccccaccttt tctttaatgg caattcaaga 4530
350 agagaaaaac aacccctgcg cgcactcgag tagtcgatca gaccttccga acgacagata 4590
352 tcatttgctg aaatcgaccg gattttaaag ctgctgccag gtcggtgaat ccccctaggt 4650
354 gatctccttg tacaaagatg ttgggcacgg acttttcgac ccggatgaga acgtcgtgaa 4710
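
The exon coordinates in the <222> LOCATION fields of SEQ ID NO: 1 can be checked against the translation with a few lines of arithmetic. The sketch below is an illustration only: the names `EXONS`, `spliced_length` and `introns_between` are hypothetical and not part of the filing, and the coordinates are assumed to be 1-based and inclusive, as the listing suggests.

```python
# Exon ranges copied from the <222> LOCATION fields of SEQ ID NO: 1.
# Assumption: coordinates are 1-based and inclusive.
EXONS = [
    (1550, 1577), (1753, 1766), (1883, 2071), (2183, 2397), (2475, 3087),
    (3231, 3356), (3454, 3475), (3565, 3881), (3959, 3970),
]


def spliced_length(exons):
    """Total number of nucleotides covered by inclusive (start, end) ranges."""
    return sum(end - start + 1 for start, end in exons)


def introns_between(exons):
    """Inclusive ranges lying between each pair of consecutive exons."""
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


total = spliced_length(EXONS)
codons, remainder = divmod(total, 3)
print(total, codons, remainder)  # 1536 512 0
print(introns_between(EXONS))
```

Under these assumptions the nine exons add up to 1536 nucleotides, a whole number of codons, and the gaps between them reproduce the eight Intron ranges given in the feature table.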
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Please Note: 

Use of n and/or Xaa has been detected in the Sequence Listing. Please review the
Sequence Listing to ensure that a corresponding explanation is presented in the <220>
to <223> fields of each sequence which presents at least one n or Xaa.

Seq#:4; N Pos. 3,6,12,15
Seq#:5; N Pos. 3,9,15,18,21
Seq#:6; N Pos. 3,6,9,12,15

Invalid <213> Response:

Use of "Artificial" only as "<213> Organism" response is incomplete,
per 1.823(b) of New Sequence Rules. Valid response is Artificial Sequence.

Seq#:4,5,6,7,8
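
The two checks described in these notes can be approximated as follows. This is a minimal sketch, not the Office's validation software: `n_positions` and `check_organism` are hypothetical helper names, and the example string is made up for illustration (it is not SEQ ID NO: 4, which is not reproduced in this section).

```python
def n_positions(seq):
    """Return the 1-based positions of 'n' in a nucleotide string.

    Whitespace (such as the gaps between 10-base blocks) and letter case
    are ignored.
    """
    compact = "".join(seq.split()).lower()
    return [i for i, base in enumerate(compact, start=1) if base == "n"]


def check_organism(value):
    """Flag a bare 'Artificial' in the <213> field, as the note above describes."""
    if value.strip().lower() == "artificial":
        return 'incomplete: use "Artificial Sequence"'
    return "ok"


# Made-up 15-base example with n at positions 3, 6, 12 and 15.
print(n_positions("ganccngatt gntan"))
print(check_organism("Artificial"))
print(check_organism("Artificial Sequence"))
```

Each position reported by such a check would then need a matching explanation in the <220> to <223> fields of the sequence concerned.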
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VERIFICATION SUMMARY 

PATENT APPLICATION: US/10/528,872 



DATE: 04/05/2005 
TIME: 12:11:20 



Input Set : A:\PTO.AMC.txt 

Output Set: N:\CRP4\04052005\J528872.raw 



L:9 M:270 C: Current Application Number differs, Replaced Current Application No 
L:9 M:271 C: Current Filing Date differs, Replaced Current Filing Date
L:671 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:4 after pos.:0
L:709 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:5 after pos.:0
L:747 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:6 after pos.:0
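
Summary lines of the form shown above can be split into fields with a short regular expression. The sketch below is an assumption-laden illustration: the field names `line`, `message_id` and `kind` are guesses at what `L`, `M` and the single-letter code stand for, and `parse_summary_line` is a hypothetical helper, not part of any Office tool.

```python
import re

# One summary entry: "L:<number> M:<number> <letter>: <free text>"
SUMMARY_RE = re.compile(r"^L:(\d+)\s+M:(\d+)\s+([A-Z]):\s*(.*)$")


def parse_summary_line(text):
    """Split one verification-summary line into its fields, or return None."""
    match = SUMMARY_RE.match(text.strip())
    if match is None:
        return None
    return {
        "line": int(match.group(1)),        # assumed: line number in the listing
        "message_id": int(match.group(2)),  # assumed: message code
        "kind": match.group(3),             # single-letter code as printed
        "text": match.group(4),
    }


entry = parse_summary_line(
    'L:671 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:4 after pos.:0'
)
print(entry["line"], entry["message_id"], entry["kind"])
```

Lines that do not follow the pattern, such as the header fields of the summary, yield `None` and can be skipped.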
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